AITopics | memory structure

CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension

Li, Rui, Zhang, Zeyu, Bo, Xiaohe, Tian, Zihang, Chen, Xu, Dai, Quanyu, Dong, Zhenhua, Tang, Ruiming

arXiv.org Artificial Intelligence, Oct-8-2025

Current Large Language Models (LLMs) are confronted with overwhelming information volume when comprehending long-form documents. This challenge raises the imperative of a cohesive memory module, which can elevate vanilla LLMs into autonomous reading agents. Despite the emergence of some heuristic approaches, a systematic design principle remains absent. To fill this void, we draw inspiration from Jean Piaget's Constructivist Theory, illuminating three traits of the agentic memory: structured schemata, flexible assimilation, and dynamic accommodation. This blueprint forges a clear path toward a more robust and efficient memory system for LLM-based reading comprehension. To this end, we develop CAM, a prototype implementation of Constructivist Agentic Memory that simultaneously embodies the structurality, flexibility, and dynamicity. At its core, CAM is endowed with an incremental overlapping clustering algorithm for structured memory development, supporting both coherent hierarchical summarization and online batch integration. During inference, CAM adaptively explores the memory structure to activate query-relevant information for contextual response, akin to the human associative process. Compared to existing approaches, our design demonstrates dual advantages in both performance and efficiency across diverse long-text reading comprehension tasks, including question answering, query-based summarization, and claim verification.

information, large language model, machine learning, (19 more...)

2510.0552

Country: Asia > China (0.14)

Genre: Research Report (0.82)

Industry: Education > Assessment & Standards > Student Performance (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Reviewer # 1

Neural Information Processing Systems, Aug-20-2025, 07:37:47 GMT

DQN paper that we applied to all the baselines and our method for the experiment. Therefore, we are certain that we have provided a fair comparison. We apologize for the confusion about the update period in Appendix D. We meant "At each " We will correct this to prevent confusion. There have been many recent related works, including the ones Reviewer 1 cited. Unfortunately, we could not detail the differences and advances of them all within the limited space of the manuscript.

hyperparameter, reinforcement, reviewer, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.82)

Numerical Investigation of Sequence Modeling Theory using Controllable Memory Functions

Jiang, Haotian, Bao, Zeyu, Wang, Shida, Li, Qianxiao

arXiv.org Artificial Intelligence, Jun-10-2025

The evolution of sequence modeling architectures, from recurrent neural networks and convolutional models to Transformers and structured state-space models, reflects ongoing efforts to address the diverse temporal dependencies inherent in sequential data. Despite this progress, systematically characterizing the strengths and limitations of these architectures remains a fundamental challenge. In this work, we propose a synthetic benchmarking framework to evaluate how effectively different sequence models capture distinct temporal structures. The core of this approach is to generate synthetic targets, each characterized by a memory function and a parameter that determines the strength of temporal dependence. This setup allows us to produce a continuum of tasks that vary in temporal complexity, enabling fine-grained analysis of model behavior concerning specific memory properties. We focus on four representative memory functions, each corresponding to a distinct class of temporal structures. Experiments on several sequence modeling architectures confirm existing theoretical insights and reveal new findings. These results demonstrate the effectiveness of the proposed method in advancing theoretical understanding and highlight the importance of using controllable targets with clearly defined structures for evaluating sequence modeling architectures.
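
The idea of a target defined by a memory function and a strength parameter can be sketched as follows. This is a minimal illustration only, assuming a causal linear target of the form y_t = sum_s rho(s) * x_{t-s} with an exponentially decaying rho; the names `exponential_memory` and `make_target`, the functional form, and the parameter `alpha` are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def exponential_memory(alpha):
    """Hypothetical memory function rho(s) = exp(-alpha * s).

    Larger alpha makes the weights decay faster, i.e. shorter memory.
    """
    return lambda s: math.exp(-alpha * s)


def make_target(inputs, rho):
    """Build y_t = sum_{s=0..t} rho(s) * x_{t-s} for every time step t.

    Each output depends only on current and past inputs (causal).
    """
    return [
        sum(rho(s) * inputs[t - s] for s in range(t + 1))
        for t in range(len(inputs))
    ]


# Sweep the strength parameter to get tasks of varying temporal complexity.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(8)]
y_short_memory = make_target(x, exponential_memory(2.0))
y_long_memory = make_target(x, exponential_memory(0.1))
```

Varying `alpha` while keeping the input sequence fixed gives a family of targets that differ only in how far back they depend on the input, which is the kind of controlled comparison the abstract describes.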

artificial intelligence, machine learning, natural language, (20 more...)

2506.05678

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

On the Structural Memory of LLM Agents

Zeng, Ruihong, Fang, Jinyuan, Liu, Siwei, Meng, Zaiqiao

arXiv.org Artificial Intelligence, Dec-16-2024

Memory plays a pivotal role in enabling large language model (LLM)-based agents to engage in complex and long-term interactions, such as question answering (QA) and dialogue systems. While various memory modules have been proposed for these tasks, the impact of different memory structures across tasks remains insufficiently explored. This paper investigates how memory structures and memory retrieval methods affect the performance of LLM-based agents. Specifically, we evaluate four types of memory structures, including chunks, knowledge triples, atomic facts, and summaries, along with mixed memory that combines these components. In addition, we evaluate three widely used memory retrieval methods: single-step retrieval, reranking, and iterative retrieval. Extensive experiments conducted across four tasks and six datasets yield the following key insights: (1) Different memory structures offer distinct advantages, enabling them to be tailored to specific tasks; (2) Mixed memory structures demonstrate remarkable resilience in noisy environments; (3) Iterative retrieval consistently outperforms other methods across various scenarios. Our investigation aims to inspire further research into the design of memory systems for LLM-based agents.

artificial intelligence, large language model, natural language, (17 more...)

2412.15266

Country:

Asia (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Deep Memory Search: A Metaheuristic Approach for Optimizing Heuristic Search

Hedar, Abdel-Rahman, Abdel-Hakim, Alaa E., Deabes, Wael, Alotaibi, Youseef, Bouazza, Kheir Eddine

arXiv.org Artificial Intelligence, Oct-22-2024

Metaheuristic search methods have proven to be essential tools for tackling complex optimization challenges, but their full potential is often constrained by conventional algorithmic frameworks. In this paper, we introduce a novel approach called Deep Heuristic Search (DHS), which models metaheuristic search as a memory-driven process. DHS employs multiple search layers and memory-based exploration-exploitation mechanisms to navigate large, dynamic search spaces. By utilizing model-free memory representations, DHS enhances the ability to traverse temporal trajectories without relying on probabilistic transition models. The proposed method demonstrates significant improvements in search efficiency and performance across a range of heuristic optimization problems.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

2410.17042

Country:

North America > United States (0.46)
Asia > Middle East (0.28)

Genre:

Overview (0.67)
Research Report > Promising Solution (0.34)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Depression Diagnosis Dialogue Simulation: Self-improving Psychiatrist with Tertiary Memory

Lan, Kunyao, Jin, Bingrui, Zhu, Zichen, Chen, Siyuan, Zhang, Shu, Zhu, Kenny Q., Wu, Mengyue

arXiv.org Artificial Intelligence, Oct-9-2024

Mental health issues, particularly depressive disorders, present significant challenges in contemporary society, necessitating the development of effective automated diagnostic methods. This paper introduces the Agent Mental Clinic (AMC), a self-improving conversational agent system designed to enhance depression diagnosis through simulated dialogues between patient and psychiatrist agents. To enhance the dialogue quality and diagnosis accuracy, we design a psychiatrist agent consisting of a tertiary memory structure, a dialogue control and reflect plugin that acts as "supervisor" and a memory sampling module, fully leveraging the skills reflected by the psychiatrist agent, achieving great accuracy on depression risk and suicide risk diagnosis via conversation. Experiment results on datasets collected in real-life scenarios demonstrate that the system, simulating the procedure of training psychiatrists, can be a promising optimization method for aligning LLMs with real-life distribution in specific domains without modifying the weights of LLMs, even when only a few representative labeled cases are available.

agent, dialogue, psychiatrist agent, (13 more...)

2409.15084

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas > Tarrant County > Arlington (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)

Making Good on LSTMs' Unfulfilled Promise

Philps, Daniel, Garcez, Artur d'Avila, Weyde, Tillman

arXiv.org Machine Learning, Nov-23-2019

LSTMs promise much to financial time-series analysis, temporal and cross-sectional inference, but we find they do not deliver in a real-world financial management task. We examine an alternative called Continual Learning (CL), a memory-augmented approach, which can provide transparent explanations: which memory did what and when. This work has implications for many financial applications, including credit, time-varying fairness in decision making and more. We make three important new observations. Firstly, as well as being more explainable, time-series CL approaches outperform LSTM and a simple sliding window learner (feed-forward neural net (FFNN)). Secondly, we show that CL based on a sliding window learner (FFNN) is more effective than CL based on a sequential learner (LSTM). Thirdly, we examine how real-world, time-series noise impacts several similarity approaches used in CL memory addressing. We provide these insights using an approach called Continual Learning Augmentation (CLA) tested on a complex real-world problem: emerging market equities investment decision making. CLA provides a test-bed as it can be based on different types of time-series learner, allowing testing of LSTM and sliding window (FFNN) learners side by side. CLA is also used to test several distance approaches used in a memory recall-gate: Euclidean distance (ED), dynamic time warping (DTW), auto-encoder (AE) and a novel hybrid approach, warp-AE. We find CLA out-performs simple LSTM and FFNN learners and CLA based on a sliding window (CLA-FFNN) out-performs an LSTM (CLA-LSTM) implementation. For memory addressing, ED under-performs DTW and AE, but warp-AE shows the best overall performance in a real-world financial task.
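
Two of the distance measures named above, Euclidean distance and dynamic time warping, can be contrasted with a short sketch. This is the textbook dynamic-programming formulation of DTW with an absolute-difference local cost, not the recall-gate implementation used in CLA; the function names and the example series are illustrative assumptions.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; requires series of equal length."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j], where each
    step may advance one series or both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance a only
                cost[i][j - 1],      # advance b only
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# The same bump, shifted by one time step.
series = [0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0]
ed = euclidean_distance(series, shifted)
dtw = dtw_distance(series, shifted)
```

Because DTW may stretch one series against the other, a time-shifted copy of a pattern stays close under DTW even when the point-by-point comparison penalizes every misaligned step.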

base learner, learner, similarity, (16 more...)

1911.04489

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Colorado > Denver County > Denver (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(4 more...)

Genre: Research Report (0.67)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Simple Strategies in Multi-Objective MDPs (Technical Report)

Delgrange, Florent, Katoen, Joost-Pieter, Quatmann, Tim, Randour, Mickael

arXiv.org Artificial Intelligence, Oct-24-2019

We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining the Pareto front. We focus on strategies that are easy to employ and implement. That is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to solve the corresponding problem. The bounded memory case can be reduced to the stationary one by a product construction. Experimental results using Storm and Gurobi show the feasibility of our algorithms.
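
The notion of a Pareto front used above can be illustrated on a finite set of reward vectors. This sketch only filters non-dominated points under maximization; it is not the MILP encoding from the report, and the example vectors are made up for illustration (imagine each as the expected rewards of one pure stationary strategy under two objectives).

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(
        a > b for a, b in zip(p, q)
    )


def pareto_front(points):
    """Keep the points that no other point dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical expected-reward vectors for five strategies, two objectives.
achieved = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = pareto_front(achieved)
```

Points on the front represent the trade-offs worth considering: improving one objective from any of them requires giving up some of the other.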

mdp, nulle null, objective, (15 more...)

1910.11024

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
Europe > Belgium (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

A Dual Memory Structure for Efficient Use of Replay Memory in Deep Reinforcement Learning

Ko, Wonshick, Chang, Dong Eui

arXiv.org Machine Learning, Jul-15-2019

Replay memory plays an important role in stable learning and fast convergence of deep reinforcement learning algorithms [1], which are methods of approximating a value or a policy function using deep neural networks [2]. The study of replay memory in reinforcement learning started from [3] and played a major role in training reinforcement learning agents to play Atari 2600 games with a Deep Q-Network (DQN) [4]. In addition, replay memory is used in other off-policy reinforcement learning algorithms such as DDPG [5] and ACER [6]. In [7], after analyzing the importance of the data in the replay memory, a probability distribution is assigned to enable efficient learning through prioritization based on the

Figure 1: Proposed dual memory structure.
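
For readers unfamiliar with the term, a plain replay memory with uniform sampling can be sketched as below. This is the standard single-buffer baseline, not the dual memory structure the paper proposes (the excerpt does not describe it); the class name, the transition layout, and the capacity are assumptions for illustration.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of transitions with uniform random sampling."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self._buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        """Store one transition observed by the agent."""
        self._buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a batch without replacement, breaking temporal correlation."""
        return random.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)


# Usage: store five transitions in a buffer that holds only three.
memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.push(step, 0, 1.0, step + 1, False)
batch = memory.sample(2)
```

Prioritized variants replace the uniform draw in `sample` with a distribution over stored transitions.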

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1907.06396

Country: Asia (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Continual Learning Augmented Investment Decisions

Philps, Daniel, Weyde, Tillman, Garcez, Artur d'Avila, Batchelor, Roy

arXiv.org Artificial Intelligence, Dec-14-2018

Investment decisions can benefit from incorporating an accumulated knowledge of the past to drive future decision making. We introduce Continual Learning Augmentation (CLA), which is based on an explicit memory structure and a feed-forward neural network (FFNN) base model and is used to drive long-term financial investment decisions. We demonstrate that our approach improves accuracy in investment decision making while memory is addressed in an explainable way. Our approach introduces novel remember cues, consisting of empirically learned change points in the absolute error series of the FFNN. Memory recall is also novel, with contextual similarity assessed over time by sampling distances using dynamic time warping (DTW). We demonstrate the benefits of our approach by using it in an expected return forecasting task to drive investment decisions. In an investment simulation in a broad international equity universe between 2003 and 2017, our approach significantly outperforms FFNN base models. We also illustrate how CLA's memory addressing works in practice, using a worked example to demonstrate the explainability of our approach.

artificial intelligence, base model, machine learning, (16 more...)

1812.0234

Country: North America > United States (0.29)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
